Cluster Selection in Divisive Clustering Algorithms
Abstract
This paper addresses the classical problem of unsupervised clustering of a data-set, one of the more basic and common problems in fields such as pattern analysis, data mining, document retrieval, image segmentation, and decision making ([13], [15]). In particular, it considers the bisecting divisive clustering approach, which recursively splits a cluster into two sub-clusters, starting from the whole data-set. By applying a bisecting divisive procedure recursively, the data-set can be partitioned into any given number of clusters. The clusters so obtained are structured as a hierarchical binary tree (a binary taxonomy), which is why the bisecting divisive approach is attractive in many applications (e.g. document-retrieval and indexing problems; see e.g. [23]). Any divisive clustering algorithm can be divided into two sub-problems:
• the problem of selecting which cluster must be split;
• the problem of how to split the selected cluster.
This paper focuses on the first sub-problem: a new method for selecting the cluster to split is proposed. This method is here presented with...
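The two sub-problems named in the abstract can be illustrated with a minimal Python sketch of a generic bisecting divisive loop. Everything in it is an assumption made for illustration: the function names are hypothetical, the selection scores (largest within-cluster scatter, or largest size via `len`) are common baseline heuristics rather than the new method the paper proposes (which the text above does not describe), and the splitter is a plain 2-means used as a stand-in.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Cluster = List[Point]


def centroid(cluster: Sequence[Point]) -> Point:
    """Component-wise mean of the points in a non-empty cluster."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter(cluster: Sequence[Point]) -> float:
    """Assumed selection score: sum of squared distances to the centroid."""
    c = centroid(cluster)
    return sum(sq_dist(p, c) for p in cluster)


def two_means_split(cluster: Cluster, iters: int = 20) -> Tuple[Cluster, Cluster]:
    """Stand-in splitter (sub-problem 2): 2-means with deterministic seeding."""
    m = centroid(cluster)
    c1 = max(cluster, key=lambda p: sq_dist(p, m))   # point farthest from the mean
    c2 = max(cluster, key=lambda p: sq_dist(p, c1))  # point farthest from that seed
    left: Cluster = []
    right: Cluster = []
    for _ in range(iters):
        new_left = [p for p in cluster if sq_dist(p, c1) <= sq_dist(p, c2)]
        new_right = [p for p in cluster if sq_dist(p, c1) > sq_dist(p, c2)]
        if not new_left or not new_right:
            break  # degenerate assignment: keep the previous split, if any
        if new_left == left:
            break  # assignments unchanged: converged
        left, right = new_left, new_right
        c1, c2 = centroid(left), centroid(right)
    if not left or not right:
        # Fallback for clusters that cannot be separated (e.g. identical points).
        half = len(cluster) // 2
        left, right = cluster[:half], cluster[half:]
    return left, right


def bisecting_clustering(
    data: Sequence[Point],
    k: int,
    select_score: Callable[[Sequence[Point]], float] = scatter,
    split: Callable[[Cluster], Tuple[Cluster, Cluster]] = two_means_split,
) -> List[Cluster]:
    """Recursively bisect until k clusters exist or nothing can be split."""
    clusters: List[Cluster] = [list(data)]
    while len(clusters) < k:
        candidates = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not candidates:
            break
        # Sub-problem 1: choose which cluster to split next.
        chosen = max(candidates, key=lambda i: select_score(clusters[i]))
        left, right = split(clusters.pop(chosen))
        clusters.extend([left, right])
    return clusters
```

Because the selection score is passed in as a parameter, a different criterion can be swapped in without touching the splitting step, which mirrors the separation of the two sub-problems. For example, `bisecting_clustering(points, 3, select_score=len)` always splits the largest cluster.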
Similar resources
Choosing the cluster to split in bisecting divisive clustering algorithms
This paper deals with the problem of clustering a data-set. In particular, the bisecting divisive approach is here considered. This approach can be naturally divided into two sub-problems: the problem of choosing which cluster must be divided, and the problem of splitting the selected cluster. The focus here is on the first problem. The contribution of this work is to propose a new simple techn...
Cluster merging and splitting in hierarchical clustering algorithms
Hierarchical clustering constructs a hierarchy of clusters by either repeatedly merging two smaller clusters into a larger one or splitting a larger cluster into smaller ones. The crucial step is how to best select the next cluster(s) to split or merge. Here we provide a comprehensive analysis of selection methods and propose several new methods. We perform extensive clustering experiments to t...
On the performance of bisecting K-means and PDDP
problem is known as bisecting divisive clustering. Note that by recursively using a divisive bisecting clustering procedure, the dataset can be partitioned into any given number of clusters. Interestingly enough, the clusters so-obtained are structured as a hierarchical binary tree (or a binary taxonomy). This is the reason why the bisecting divisive approach is very attractive in many applicat...
A Multi-level Approach for Document Clustering
The divisive MinMaxCut algorithm of Ding et al. [3] produces more accurate clustering results than existing document clustering methods. Multilevel algorithms [4, 1, 5, 7] have been used to boost the speed of graph partitioning algorithms. We combine these two algorithms to construct a faster and more accurate algorithm. In this new algorithm, the original graph is coarsened, partitioned by the divi...
Robust DNA Microarray Clustering Techniques for Oncological Diagnosis
Machine learning techniques are increasingly popular tools for understanding complex biological data. Prior research has demonstrated the power of simple statistical clustering algorithms for disease class discovery and prediction. In this work we examine the efficacy of spectral and divisive clustering on gene expression microarray data. In particular we consider simultaneous expression cluste...
Publication date: 2002